Automated gating was able to match the performance of central manual analysis for all tested panels, exhibiting little to no bias and comparable variability. Standardized staining, data collection, and automated gating can increase power, reduce variability, and streamline analysis for immunophenotyping.
The two top performing gating algorithms - OpenCyto (v. 1.7.4), flowDensity (v. 1.4.0) - in a study run by the FlowCAP consortium aimed at selecting the best performing algorithms for this larger study were chosen for the analysis presented in this paper.
"Standardizing Flow Cytometry Immunophenotyping Analysis from the Human ImmunoPhenotyping Consortium" (Finak, Langweiler, Jaimes, et al., 2016)
OpenCyto is an analysis framework designed to automate the accurate gating of flow cytometry data with limited bias (Finak, Frelinger, Jiang, et al., 2014; Finak, Langweiler, Jaimes, et al., 2016). We propose to use OpenCyto to perform systematic and reproducible gating of 28 immune cell subsets. Gating is standardized via a .csv template that specifies the algorithmic approach for each step of the gating hierarchy. Importantly, this methodology scales to thousands of samples while producing interpretable, labelled populations.
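To make the idea of a template-driven hierarchy concrete, here is a minimal Python sketch of how one row per gate can encode a gating tree. It is not the template used in this work and it does not call OpenCyto: the column names, population aliases, and method names are illustrative assumptions and may not match the OpenCyto template format exactly.

```python
import csv
import io

# Hypothetical template rows: the actual template is not reproduced in the text.
TEMPLATE = """alias,parent,dims,gating_method
lymphocytes,root,"FSC-A,SSC-A",flowClust
singlets,lymphocytes,"FSC-W,FSC-H",singletGate
live,singlets,PE,mindensity
tcells,live,"CD3,CD19",mindensity
"""


def load_template(text):
    """Parse the CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def gating_paths(rows):
    """Map each population alias to its full path in the gating hierarchy."""
    paths = {"root": "root"}
    for row in rows:
        parent = row["parent"]
        if parent not in paths:
            # A gate can only be applied once its parent population exists.
            raise ValueError(f"{row['alias']!r} refers to undefined parent {parent!r}")
        paths[row["alias"]] = paths[parent] + "/" + row["alias"]
    return {alias: path for alias, path in paths.items() if alias != "root"}


for alias, path in gating_paths(load_template(TEMPLATE)).items():
    print(alias, "->", path)
```

Because every gate names its parent, the same file can be applied unchanged to each sample, which is what makes the resulting populations comparable across a large study.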
OpenCyto gives the user many options for refining algorithmic parameters to improve the performance of each step in the gating hierarchy. We evaluated our OpenCyto template on internal data comprising 151 samples manually gated in FlowJo, across 15 gates. The global correlation between manual and OpenCyto population counts was high (rho = 0.9846, p < 2e-16). Despite this high global concordance, some subsets were less well correlated (e.g., activated CD4 counts: rho = 0.6222, p < 2e-16).
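The comparison above can be sketched as a per-gate rank correlation, which shows how a high global value can hide a poorly matched gate. This is a minimal sketch that assumes rho denotes Spearman's rank correlation (the text does not say which statistic was used); the gate names and counts below are invented for illustration.

```python
import math
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


def rho_by_gate(records):
    """records: iterable of (gate, manual_count, automated_count)."""
    manual, automated = defaultdict(list), defaultdict(list)
    for gate, m, a in records:
        manual[gate].append(m)
        automated[gate].append(a)
    return {gate: spearman_rho(manual[gate], automated[gate]) for gate in manual}


# Invented counts: gate "A" agrees in rank order, gate "B" does not.
records = [("A", 10, 12), ("A", 20, 19), ("A", 30, 33),
           ("B", 5, 9), ("B", 6, 7), ("B", 7, 8)]
print(rho_by_gate(records))
```

Reporting rho per gate alongside the global value makes it easier to decide which steps of the template need their parameters refined.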
While OpenCyto can automate the classification of known subsets by following a traditional gating hierarchy, it does not easily facilitate the discovery of novel populations.
We propose two methods for unsupervised clustering of high-dimensional flow cytometry data to search for novel cell subsets that may discriminate case/control status. For both methods, we will first use OpenCyto to limit the search space (e.g., starting from live, single T cells) and then search for novel populations within that clean subset.
Citrus (cluster identification, characterization, and regression; Bruggner, Bodenmiller, Dill, et al., 2014) is specifically designed to find cell subsets that predict case/control status, and it provides diagnostic plots detailing the predictive accuracy of any subsets discovered. A notable limitation of Citrus is its assumption that predictive subsets make up a large percentage (default 5%) of all study events, which may limit its use for detecting rare but differential subsets.
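The effect of this size threshold is easy to see with a little arithmetic. The sketch below is not Citrus code: the function name and the event counts are hypothetical, and only the 5% default comes from the text.

```python
def passes_size_filter(subset_events, total_events, min_fraction=0.05):
    """Return True if a subset is large enough to be considered at all."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return subset_events / total_events >= min_fraction


# Hypothetical pooled dataset of one million events.
total = 1_000_000
print(passes_size_filter(50_000, total))  # a subset at exactly 5% of events
print(passes_size_filter(500, total))     # a subset at 1 in 2,000 events
```

Under these assumptions, a subset at a frequency of 1 in 2,000 would fall well below the default threshold and would never be tested for association with case/control status.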
PhenoGraph (Levine, Simonds, Bendall, et al., 2015) also performs unsupervised clustering of high-dimensional single-cell data and can identify subsets present in as few as 1 in 2,000 cells. Because PhenoGraph clusters are not immediately interpretable (they are labelled 1, 2, 3, etc.), we will need a method to interpret the clusters, compare cases to controls, find distinguishing populations, and determine which populations are novel.
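One simple starting point for comparing cases to controls is to convert the per-cell cluster labels into per-sample cluster frequencies. The sketch below assumes the clustering has already produced one integer label per cell; the sample IDs, group assignments, and labels are invented, and a real analysis would add a formal statistical test.

```python
from collections import Counter, defaultdict


def cluster_fractions(cells):
    """cells: iterable of (sample_id, cluster_label) pairs, one per cell.

    Returns {sample_id: {cluster_label: fraction of that sample's cells}}.
    """
    counts = defaultdict(Counter)
    for sample_id, cluster in cells:
        counts[sample_id][cluster] += 1
    return {
        sample_id: {c: n / sum(counter.values()) for c, n in counter.items()}
        for sample_id, counter in counts.items()
    }


def mean_fraction_by_group(fractions, groups, cluster):
    """Average one cluster's per-sample fraction within each group."""
    by_group = defaultdict(list)
    for sample_id, fracs in fractions.items():
        # A cluster absent from a sample contributes a fraction of zero.
        by_group[groups[sample_id]].append(fracs.get(cluster, 0.0))
    return {group: sum(v) / len(v) for group, v in by_group.items()}


# Invented example: two samples, two clusters.
cells = [("s1", 1), ("s1", 1), ("s1", 2), ("s1", 2),
         ("s2", 1), ("s2", 2), ("s2", 2), ("s2", 2)]
groups = {"s1": "case", "s2": "control"}
fractions = cluster_fractions(cells)
print(mean_fraction_by_group(fractions, groups, cluster=1))
```

Working with fractions rather than raw counts keeps samples with different numbers of acquired events comparable.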
Note: t-SNE may not be strictly necessary or feasible at the 1,000+ sample scale, but it produces visually appealing plots.
Lastly, any novel and discriminating populations detected will be visualized by projecting the data into 2D space using t-SNE. t-SNE provides an overview of the dataset and can aid in visually assessing how distinct a novel cluster is across many dimensions.
Alternatively, we will compute the marker enrichment modeling (MEM) score (Diggins, Greenplate, Leelatian, et al., 2017) of each PhenoGraph cluster detected. MEM scores provide a quantitative description of each cluster's features relative to a reference population.
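As a rough illustration of what a MEM score captures, the sketch below implements the core per-marker calculation as we recall it from Diggins et al. (2017): the difference in medians between the population and the reference, plus the ratio of reference to population interquartile range minus one, with the sign following the direction of the median difference. This should be checked against the published definition; it omits further steps of the published method, such as rescaling, and the function name and inputs are our own.

```python
def mem_score(pop_median, pop_iqr, ref_median, ref_iqr):
    """Unscaled MEM-style score for one marker (sketch, see caveats above).

    A positive value suggests enrichment in the population relative to the
    reference; a negative value suggests relative absence.
    """
    if pop_iqr <= 0:
        raise ValueError("pop_iqr must be positive")
    difference = pop_median - ref_median
    magnitude = abs(difference) + ref_iqr / pop_iqr - 1
    return -magnitude if difference < 0 else magnitude


# Invented summary statistics for two markers in one cluster.
print(mem_score(pop_median=4.0, pop_iqr=1.0, ref_median=1.0, ref_iqr=2.0))
print(mem_score(pop_median=1.0, pop_iqr=2.0, ref_median=4.0, ref_iqr=1.0))
```

A tightly distributed marker with a shifted median scores higher than a broadly distributed one with the same shift, which is what makes the score useful for labelling otherwise anonymous clusters.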
| unique(data_xk_all$V1) |
|---|
| Lymphocytes (SSC-A v FSC-A)=1 |
| central memory cytotoxic Tcells (CCR7+ , CD45RA-)=15 |
| activated helper Tcells (CD4+ HLA-DR+)=17 |
| effector memory cytotoxic Tcells (CD95+ CD28-)=18 |
| central memory helper Tcells (CD95+, CD28+)=19 |
| Single Cells (FSC-H v FSC-W)=2 |
| effector memory helper Tcells (CD95+, CD28-)=20 |
| activated cytotoxic Tcells (CD8+ HLA-DR+)=21 |
| EM2 cytotoxic Tcells (CD27+ CD28-)=22 |
| EM4 cytotoxic Tcells (CD27- CD28+)=23 |
| pE1 cytotoxic Tcells (CD27+ CD28+)=24 |
| effector memory helper Tcells (CCR7- CD45RA-)=25 |
| naive cytotoxic Tcells (CD95- CD28+)=26 |
| naive helper Tcells (CCR7+ CD45RA+)=27 |
| EM3 cytotoxic Tcells (CD27- CD28-)=28 |
| pE cytotoxic Tcells (CD27- CD28-)=29 |
| Live cells (PE-)=3 |
| naive cytotoxic Tcells (CCR7+ , CD45RA+)=30 |
| central memory cytotoxic Tcells (CD95+ CD28+)=31 |
| EM1 cytotoxic Tcells (CD27+ CD28+)=32 |
| pE2 cytotoxic Tcells (CD27+ , CD28-)=33 |
| effector helper Tcells (CCR7- CD45RA+)=34 |
| cytotoxic Tcells CD27- , CD28+=36 |
| central memory helper Tcells (CCR7+ CD45RA-)=37 |
| naive helper Tcells (CD95-, CD28+)=38 |
| Tcells (CD3+ CD19-)=6 |
| cytotoxic Tcells-CD8+=7 |
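
Each label above combines a population name with a numeric ID in the form `name=ID`, and the rows appear in string order (1, 15, 17, ..., 2, 20, ...) rather than numeric order. A minimal Python sketch for splitting the labels and sorting them by ID, using a few of the labels from the table:

```python
def parse_label(label):
    """Split 'population name=ID' into (name, integer ID)."""
    # Split on the last '=' so that marker signs such as 'CD8+' stay in the name.
    name, _, raw_id = label.rpartition("=")
    return name.strip(), int(raw_id)


labels = [
    "Lymphocytes (SSC-A v FSC-A)=1",
    "central memory cytotoxic Tcells (CCR7+ , CD45RA-)=15",
    "Single Cells (FSC-H v FSC-W)=2",
    "Live cells (PE-)=3",
    "Tcells (CD3+ CD19-)=6",
    "cytotoxic Tcells-CD8+=7",
]

# Sort by the numeric ID instead of the default string order.
populations = sorted((parse_label(label) for label in labels), key=lambda p: p[1])
for name, pop_id in populations:
    print(pop_id, name)
```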